The overlap number of a graph 
Abstract



An overlap representation is an assignment of sets to the vertices of a graph in such a way 
that two vertices are adjacent if and only if the sets assigned to them overlap. The overlap 
number of a graph is the minimum number of elements needed to form such a representation. 
We find the overlap numbers of cliques and complete bipartite graphs by relating the problem 
to previous research in combinatorics. The overlap numbers of paths, cycles, and caterpillars 
are also established. Finally, we show the NP-completeness of the problems of extending an
overlap representation and finding a minimum overlap representation with limited containment.

1 Introduction


All graphs we consider are finite and simple. The subgraph of $G = (V, E)$ induced by $V' \subseteq V$ is
denoted by $G[V']$. For a graph $G = (V,E)$ and vertex $v \in V$, $N(v) = \{u \mid (u,v) \in E\}$ is called the
open neighbourhood of vertex $v$; $N[v] = N(v) \cup \{v\}$ is the closed neighbourhood of vertex $v$. The
minimum degree of any vertex in a graph $G$ is denoted $\delta(G)$. A set of vertices is a clique if any two
vertices in the set are adjacent. A clique is maximal if it is not contained in a larger clique. An
independent set is a set of pairwise nonadjacent vertices. We denote by $K_n$ the complete graph on
$n$ vertices. We use the notation $P_n$ for the path on $n$ vertices, and $C_n$ for the cycle on $n$ vertices.
We also need to introduce two covers of a graph: a clique cover is a covering of the vertices of a 
graph by cliques, and an edge-clique cover is a covering of the edges of a graph by cliques. 

Two sets overlap if they intersect and neither set contains the other. An overlap representation 
(respectively intersection representation) for a graph G is an assignment of sets to the vertices of 
G such that two vertices are adjacent if and only if the sets assigned to them overlap (respectively 
intersect). The size of such a representation is the cardinality of the union of the assigned sets, 
and the minimum size of a representation is termed the overlap number (respectively intersection 
number) of the graph. The overlap number of graph $G$ is denoted $\varphi(G)$. Every graph has an overlap
representation. This follows from the fact that all graphs have intersection representations [16] (for
an English translation, see [2]), and the observation that we can take an intersection representation 
for a graph and add a new element to each set, which causes sets to overlap if and only if they 
intersect. While intersection representations of graphs have been widely studied, overlap represen- 
tations have received considerably less attention, even though overlapping is a natural relation of 
pairs of sets. 
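
The definitions above translate directly into a few lines of code. The following Python sketch is only an illustration: the function names and the dictionary-of-sets encoding are assumptions made here, not notation from the paper. It tests whether two sets overlap and applies the add-one-element argument to turn an intersection representation into an overlap representation.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def intersects(a, b):
    """Plain intersection test, used for intersection representations."""
    return bool(a & b)


def represented_edges(rep, relation):
    """Pairs of vertices whose assigned sets satisfy the given relation."""
    return {frozenset(p) for p in combinations(rep, 2)
            if relation(rep[p[0]], rep[p[1]])}


def intersection_to_overlap(rep):
    """Add one private new element to every set (elements assumed to be ints)."""
    base = max(set().union(*rep.values()), default=0)
    return {v: set(rep[v]) | {base + k}
            for k, v in enumerate(rep, start=1)}


# Hypothetical example: an intersection representation of the path a - b - c.
inter_rep = {"a": {1}, "b": {1, 2}, "c": {2}}
over_rep = intersection_to_overlap(inter_rep)
```

Because each set receives its own new element, no set can contain another afterwards, so intersecting pairs become overlapping pairs and disjoint pairs stay disjoint. The price is one extra element per vertex.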

The intersection number parameter was introduced and bounded by Erdős, Goodman, and
Pósa [3]. Specifically, they give a tight upper bound of $\lfloor n^2/4 \rfloor$ on the intersection number of
an $n$-vertex graph. The NP-completeness of computing the intersection number is shown by
Kou, Stockmeyer, and Wong [9]. In addition to these results, Raychaudhuri gives polynomial
time algorithms for the intersection numbers of chordal graphs [12] and $W_4$-free comparability
graphs [13], where $W_4$ is a cycle on four vertices with a universal vertex.

For the overlap number, only a few algorithms and bounds are known. Using the simple
technique of adding a new element to each set in the representation, the intersection number bound
of [1] shows that $\varphi(G) \le \lfloor n^2/4 \rfloor + n$ for any graph $G$ with $n$ vertices. It is known that any
cocomparability graph $G$ on $n$ vertices has $\varphi(G) \le n + 1$, since a containment representation of
size $n + 1$ exists for the complement $\overline{G}$ in which every set has a common element [B], and such a representation is an
overlap representation for $G$. Thus $K_{\lfloor n/2 \rfloor, \lceil n/2 \rceil}$ has intersection number $\lfloor n^2/4 \rfloor$ [3] and overlap number
at most $n + 1$ since it is a cocomparability graph. Any graph with at most a linear number of
maximal cliques must have linear intersection number, as a minimum intersection representation
and a minimum edge-clique cover have the same size [4]. This implies that a graph with a linear
number of maximal cliques must also have linear overlap number, since we can add a new element
to each set of a minimum intersection representation. This technique yields linear bounds on
the overlap number of trees, chordal graphs, and planar graphs, using previous linear bounds on
the number of maximal cliques in chordal graphs [5] and planar graphs [11]. Henderson [7] gives
lower bounds on the overlap number of a graph in terms of its independent sets, and constant
factor approximation algorithms for the overlap numbers of trees and planar graphs. In addition,
he shows that there exist graphs with overlap number quadratic in the number of vertices of the
graph. Cranston et al. [3] show how to compute the overlap number of a tree in linear time, and give
upper bounds on the overlap numbers of some graphs. Their results include the following bounds,
which are satisfied with equality for some graphs: $\varphi(G) \le |E| - 1$ where $G = (V,E)$, $G \ne K_3$, and
$\delta(G) \ge 2$, and $\varphi(G) \le n^2/4 - n/2 - 1$ where $G$ is an $n$-vertex graph with $n \ge 14$.

In this paper we present hardness results for problems related to finding the overlap number, 
and give formulas and describe algorithms for the overlap numbers of some simple graphs. These 
results appeared in the first author's Master's thesis [14].

The remainder of this paper is organized as follows. In Section 2 we formally introduce the
overlap number and discuss some of the basic properties of overlap representations. We follow this
with Section 3, where we give formulas and describe algorithms for the overlap numbers of some
graphs. Finally, in Section 4 we present some NP-completeness results on problems related to the
overlap number.

2 The Overlap Number of a Graph 

Before formalizing the definition of an overlap representation, we note that by a collection we refer 
to a multiset of sets, and for simplicity we allow the mapping from sets of a representation to 
vertices of a graph to remain implicit, with the set $S_v$ associated with the vertex $v$ of the graph.
With these notational issues resolved, we define an overlap representation in the following way. 

Definition 1. Given a graph $G = (V, E)$, a collection $\mathcal{C} = \{S_v : v \in V\}$ is an overlap representation
for $G$ if for every $u, v \in V$ we have

$(u, v) \in E$ if and only if $S_u \cap S_v \ne \emptyset$, $S_u \not\subseteq S_v$, and $S_v \not\subseteq S_u$.

[Figure: a graph whose vertices are labelled with the sets of the representation; the labels recoverable here are {1,3}, {3,4}, {2,3}, and {1,2,4}.]

Figure 1: Example of a minimum overlap representation.

We define the size of a representation to be the number of elements used in the representation, which
is $\left|\bigcup_{v \in V} S_v\right|$, and we let the overlap number, $\varphi(G)$, be the size of a minimum overlap representation.

As can be seen from the definition, overlap representations have many similarities to intersec- 
tion representations: sets assigned to adjacent vertices must intersect and disjoint sets map to 
nonadjacent vertices. In the overlap case, however, the situation is more complex, as not only do 
we need to ensure a stronger condition than intersection for adjacent vertices, we have a choice of 
representation, for every non-edge, of disjointedness or containment. As an example, consider the 
representation in Figure 1, where there are nonadjacent vertices represented in each of these ways.

In the case of an intersection representation, if we take an element of the representation and 
examine all those sets it is contained in, we find that the vertices associated with them form a 
clique. Doing the same in an overlap representation once again leaves us with a collection of 
vertices with intersecting sets, except here we may have non-edges represented by containment, and 
so, since the orientation implied by set containment forms a partial order, we can map elements of 
the representation to cocomparability graphs. Unfortunately, while covering all edges of a graph
with cliques leads to an intersection representation, if we simply cover the edges of a graph with 
cocomparability graphs, we do not generally end up with a valid overlap representation (most 
cocomparability graphs do not have overlap number one). 

Observation 2. If $\{S_v : v \in V\}$ is an overlap representation for $G = (V,E)$ then, for any $V' \subseteq V$,
$\{S_v : v \in V'\}$ is an overlap representation for $G[V']$. Thus $\varphi(G) \ge \varphi(H)$ for all induced subgraphs
$H$ of $G$.

Vertex multiplication is the expansion of a vertex into an independent set, such that the vertices 
of the independent set have the same adjacencies as the original vertex. 

Observation 3. If there is an overlap representation of size s for graph G then there is an overlap 
representation of size s for graph G' , where G' arises from G by vertex multiplication. 

This can be observed by simply duplicating the set assigned to a vertex when it is multiplied. 
Note that the size of an intersection representation is preserved by the operation of expanding a 
vertex to a clique but not by vertex multiplication. 

We conclude this section with three results that will be used in the next section. 

Lemma 4. If $A$, $B$, and $C$ are three sets such that $A \subseteq C$, $A$ overlaps $B$, but $B$ does not overlap
$C$, then $B \subseteq C$.

Proof. Since $A$ and $B$ overlap, we have $A \cap B \ne \emptyset$ and $A \setminus B \ne \emptyset$ which, combined with the fact
that $A \subseteq C$, imply $C \cap B \ne \emptyset$ and $C \setminus B \ne \emptyset$. Now since $B$ does not overlap $C$, $B \subseteq C$. □
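
Lemma 4 can also be checked exhaustively on a small ground set. The Python sketch below is a sanity check under the assumption that a four-element universe is large enough to be informative; it is not a substitute for the proof, and the function names are chosen here for illustration.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def subsets(universe):
    """All subsets of the universe, as frozensets."""
    items = list(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def lemma4_counterexamples(universe):
    """Triples (A, B, C) meeting the hypotheses of Lemma 4 but with B not inside C."""
    family = subsets(universe)
    return [(a, b, c)
            for a in family for b in family for c in family
            if a <= c and overlaps(a, b) and not overlaps(b, c)
            and not b <= c]


bad_triples = lemma4_counterexamples(range(4))
```

The search covers all 16^3 triples of subsets of a four-element set and is expected to return an empty list, in agreement with the lemma.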

We amplify this lemma to the following stronger result that we use in Section 3.3 to argue
bounds on the size of an overlap representation for a graph, based on the size of representations 
for the connected components of the graph. 

Lemma 5. Let $G = (V, E)$ be a graph with overlap representation $\mathcal{C} = \{S_v : v \in V\}$. Let $X$ and $Y$
be nonempty subsets of $V$ such that each of $G[X]$ and $G[Y]$ is connected, and no edge of $E$ has one
endpoint in $X$ and the other in $Y$. Let $U_X = \bigcup_{x \in X} S_x$ and $U_Y = \bigcup_{y \in Y} S_y$. If $S_x \subseteq S_y$ for some
$x \in X$, $y \in Y$, then

(i) for all $y \in Y$, $U_X \subseteq S_y$ or $U_X \cap S_y = \emptyset$, and

(ii) if $|X| > 1$ or $|Y| > 1$ then for all $x \in X$, $y \in Y$, $S_y \not\subseteq S_x$.

Proof. We first show that, for all $x \in X$, $y \in Y$,

(1) if $S_x \subseteq S_y$ then $U_X \subseteq S_y$, (2) if $S_y \subseteq S_x$ then $U_Y \subseteq S_x$, and (3) if $S_x = S_y$ then $|X| = |Y| = 1$.

Suppose (1) is false. Let $x' \in X$ be such that $S_{x'}$ contains an element that is not in $S_y$, where
the distance in $G[X]$ from $x$ to $x'$ is as small as possible. Let $x = x_1 x_2 \ldots x_k = x'$ be a shortest
$x,x'$-path in $G[X]$. Then $S_{x_{k-1}} \subseteq S_y$ (by the choice of $x'$), $S_{x_{k-1}}$ overlaps $S_{x'}$ (because $x_{k-1}$ and $x'$
are adjacent on the path), and $S_{x'}$ does not overlap $S_y$ (since $x'$ and $y$ are not adjacent in $G$). But
then by Lemma 4, $S_{x'} \subseteq S_y$, contradicting the choice of $x'$. The justification for (2) is similar. For
(3), if $S_x = S_y$ then $x$ and $y$ are adjacent to exactly the same vertices in $G$, and therefore, since
each of $G[X]$ and $G[Y]$ is connected, and there are no edges between $X$ and $Y$ in $G$, $|X| = |Y| = 1$.

Note that (i) is true if $|X| = |Y| = 1$ since then $U_X = S_x \subseteq S_y$ for $x \in X$, $y \in Y$. Now suppose
(i) is false. Then $|X| > 1$ or $|Y| > 1$ and there exists $y \in Y$ with $U_X \not\subseteq S_y$ and $U_X \cap S_y \ne \emptyset$.
By (1), there is no $x \in X$ with $S_x \subseteq S_y$. Therefore, $S_y \subseteq S_{x'}$ for some $x' \in X$. We also have
$S_x \subseteq S_{y'}$ for some $x \in X$, $y' \in Y$, by the statement of the lemma. But now, by (1) and (2),
$U_X \subseteq S_{y'} \subseteq U_Y \subseteq S_{x'} \subseteq U_X$, which implies $S_{x'} = S_{y'}$, contradicting (3).

If (ii) is false, we again have $S_x \subseteq S_{y'}$ and $S_y \subseteq S_{x'}$ for some $x, x' \in X$ and $y, y' \in Y$ which,
together with (1) and (2), contradicts (3). □

In Section 3.2 we use the following simplification of Lemma 5 to argue bounds on the overlap
numbers of paths, cycles, and caterpillars. 

Corollary 6. Let $G = (V,E)$ be a graph, and let $\mathcal{C} = \{S_v : v \in V\}$ be an overlap representation
of $G$. Fix $v \in V$, and let, for $u \in V \setminus N[v]$, $A_v(u)$ be the vertex set of the connected component of
$G[V \setminus N[v]]$ that contains $u$. If $S_u \subseteq S_v$, then $\bigcup_{w \in A_v(u)} S_w \subseteq S_v$.

3 Minimum Overlap Representations 

In this section, we give formulas for the overlap numbers of cliques, complete $k$-partite graphs,
paths, cycles, and caterpillars and for the overlap number of a disconnected graph in terms of the 
overlap numbers of its connected components. 

3.1 Cliques and Complete k-partite Graphs



An overlap representation of a clique is simply a collection of sets where no set contains any other 
and no two sets are disjoint. We can apply a theorem of Milner to find the minimum size of such 
a representation. 

Definition 7. The maximum size of a family, $\mathcal{C}$, of subsets of $\{1, 2, \ldots, m\}$ satisfying, for $p \ge 0$,

1. If $A, B \in \mathcal{C}$, with $A \ne B$, then $A \not\subseteq B$,

2. If $A, B \in \mathcal{C}$, then $|A \cap B| \ge p$,

is denoted $S(p,m)$.

The value of the function $S(p, m)$ is exactly the quantity given by Milner's Theorem, first
published in 1966.

Theorem 8 (Milner [10]). For $m \ge 1$ and $p \ge 0$,

$$S(p,m) = \binom{m}{\lfloor (m+p+1)/2 \rfloor}.$$

Also noted by Milner [10] is that it is easy to construct a collection that achieves this bound,
by simply choosing all subsets of $\{1, 2, \ldots, m\}$ of size $\lfloor (m + p + 1)/2 \rfloor$. This, reformulated in the
language of overlap representations, is precisely the content of the following corollary. 

Corollary 9. For $n \ge 1$, $\varphi(K_n) = \min\{m : n \le S(1, m)\}$.

Proof. Consider any overlap representation, $\mathcal{B}$, of $K_n$. Any two elements of $\mathcal{B}$ must intersect, and
no element can contain any other, as each pair of vertices in $K_n$ forms an edge. Thus $|\mathcal{B}| \le S(1, m)$,
where $m = \left|\bigcup_{A \in \mathcal{B}} A\right|$.

For any $m$, consider the collection given by $\mathcal{C}_m = \{A \subseteq \{1, 2, \ldots, m\} : |A| = \lfloor (m + 2)/2 \rfloor\}$. As
we have $\lfloor (m + 2)/2 \rfloor > m/2$, any two elements of $\mathcal{C}_m$ form an intersecting pair, and furthermore,
no element is contained in any other, as they all have the same size. Counting the number of ways
to form subsets of $\{1, 2, \ldots, m\}$, we obtain

$$|\mathcal{C}_m| = \binom{m}{\lfloor (m+2)/2 \rfloor} = S(1, m).$$

Then, to find the minimum representation, we seek the minimum $m$ that leaves enough room to
form an overlap representation. We can simply choose any $n$ elements of $\mathcal{C}_m$ to obtain an overlap
representation on $m$ elements. Thus, $\varphi(K_n)$ is the smallest $m$ such that $n \le |\mathcal{C}_m| = S(1, m)$, as
desired. □



The next result follows immediately from Corollary 9 and Observation 3.

Corollary 10. If $G$ is a complete $k$-partite graph, then $\varphi(G) = \min\{m : k \le S(1, m)\}$.

We now investigate some computational issues involved in finding overlap representations of
cliques and $k$-partite graphs. This is done by first finding bounds on $\varphi(K_n)$ in terms of $n$ which,
together with the constructive proof of Corollary 9, yield a simple polynomial time algorithm to
produce a minimum overlap representation of $K_n$.

In order to gain a view of the size of the required representation for a given graph, we unwind
the expression

$$\min\left\{ m : n \le \binom{m}{\lfloor (m+2)/2 \rfloor} \right\}$$

to obtain an asymptotically tight bound on $\varphi(K_n)$ in terms of $n$. We make use of Stirling's
Approximation, which can be found, for example, in pQ: 



$$\sqrt{2\pi n}\,(n/e)^n \le n! \le e^{1/(12n)} \sqrt{2\pi n}\,(n/e)^n. \qquad (1)$$

This results in the following lemma.

Lemma 11. For $1 \le k < n$,

$$\binom{n}{k} \ge \frac{1}{\sqrt{8\pi k}} \left(\frac{n}{k}\right)^{k} \left(\frac{n}{n-k}\right)^{n-k}.$$

Proof. The inequality follows by substituting equation (1) into the expansion of the binomial coefficient. □

Using this lemma, we bound the size of the minimum overlap representation of the graphs we 
have considered. The proof here is simply a calculation and is omitted. 

Theorem 12. For $n > 1$,

$$\min\left\{ m : n \le \binom{m}{\lfloor (m+2)/2 \rfloor} \right\} \in \Theta(\log n).$$

The next result is the final ingredient needed to build an efficient algorithm to find, for a given
$n$, the minimum $m$ such that $n \le S(1, m)$.



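Proposition 13. For any $m \ge 2$,

$$S(1,m) = \begin{cases} \dfrac{2m}{m-1}\, S(1,m-1) & \text{if $m$ is odd,} \\[1ex] \dfrac{2m}{m+2}\, S(1,m-1) & \text{if $m$ is even.} \end{cases}$$

Proof. The proof makes use of the following identities on binomial coefficients

$$\binom{n}{k} = \frac{n}{n-k} \binom{n-1}{k} \qquad \text{and} \qquad \binom{n}{k} = \frac{n}{k} \binom{n-1}{k-1},$$

which can be found, for example, in [8]. □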

Using the recurrence of Proposition 13, we can compute $S(1, m)$ for successive values of $m$,
until we find $\varphi(K_n)$, the smallest $m$ such that $n \le S(1, m)$. By Theorem 12, this produces an
algorithm with runtime in $O(\log n)$. To compute a minimum representation for $K_n$, as noted in
the proof of Corollary 9, we simply take any $n$ of the subsets of $\{1, 2, \ldots, \varphi(K_n)\}$ of cardinality
$\lfloor (\varphi(K_n) + 2)/2 \rfloor$. Since $\varphi(K_n) \in O(\log n)$ (by Theorem 8, Corollary 9, and Theorem 12), $n$ of these
subsets can be found in $O(n)$ time. These algorithms can also be immediately extended to find
representations for complete $n$-partite graphs, as described in Observation 3.
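
The procedure just described can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, and the recurrence is the one stated in Proposition 13, started from $S(1,1) = 1$.

```python
from itertools import combinations, islice


def clique_overlap_number(n):
    """Smallest m with n <= S(1, m), stepping m with the recurrence."""
    if n < 1:
        raise ValueError("n must be at least 1")
    m, s = 1, 1  # S(1, 1) = C(1, 1) = 1
    while s < n:
        m += 1
        if m % 2 == 1:
            s = s * 2 * m // (m - 1)  # m odd
        else:
            s = s * 2 * m // (m + 2)  # m even
    return m


def clique_representation(n):
    """Any n subsets of {1, ..., m} of size floor((m + 2) / 2)."""
    m = clique_overlap_number(n)
    size = (m + 2) // 2
    return [set(c) for c in islice(combinations(range(1, m + 1), size), n)]
```

The multiplication is carried out before the integer division so that every intermediate value stays an exact integer. Since $S(1,m)$ grows exponentially in $m$, the loop runs only logarithmically many times in $n$.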

3.2 Paths, Cycles, and Caterpillars 

A minimum intersection representation for a path is simple to find, as there is only one possible 
edge-clique cover, the one consisting of each maximal clique. While it is essentially no harder 
to find an overlap representation of a path, proving the optimality of the representation is more 
difficult. Once we have shown the size that an overlap representation of a path must have, we 
extend the result to the case of cycles and caterpillars. The construction, given as part of the proof
of the following theorem, can be immediately transformed into an efficient algorithm for generating
an overlap representation of $P_n$. This theorem does not hold for $P_2$, as $\varphi(P_2) = 3$.

Theorem 14. For $n \ge 3$, $\varphi(P_n) = n$.

Proof. For $n = 3$, we observe that $\{\{1, 2\}, \{2, 3\}, \{1, 2\}\}$ is a minimum overlap representation,
since we need at least three elements to represent a single edge. We now show that, for $n \ge 4$,
$\varphi(P_n) \ge \varphi(P_{n-1}) + 1$, thereby proving that $\varphi(P_n) \ge n$. For $n \ge 4$, let $1, 2, \ldots, n$ be the vertices
of $P_n$ in the order in which they appear on the path, and let $\mathcal{C} = \{S_1, S_2, \ldots, S_n\}$ be a minimum
overlap representation for $P_n$. By Corollary 6, either $S_1$ contains none of $\{S_3, S_4, \ldots, S_n\}$, or it
contains each $S_i$ for $i \ge 3$. Notice also that these two cases collapse, since if $S_1$ contains all $S_i$ for
$i \ge 3$, then in particular, $S_n \subseteq S_1$, and so, if we consider the reversal of the path, we find that $S_n$
contains none of the other sets, since $n \ge 4$. We thus need only consider the first case.

To this end, let the representation, without loss of generality, be such that the set $S_1$ contains
none of $\{S_3, \ldots, S_n\}$. Notice that with the exception of $S_2$, the elements of $S_1$ are either all contained
in one of the other sets, or none of them are. We form a representation for $P_{n-1}$ where these elements
are compressed into a single element. We consider the collection given by $\mathcal{C}' = \{S_2 \cup S_1, S_3, \ldots, S_n\}$.
As $S_1$ and $S_2$ share at least one element, $S_k$ does not overlap $S_2 \cup S_1$ for any $k \ge 4$. To see this,
we consider two cases. The first case is that $S_1 \subseteq S_k$, but then, by Lemma 4, we have $S_2 \subseteq S_k$
as well, which implies that $S_1 \cup S_2 \subseteq S_k$, as desired. In the other case we have $S_1$ disjoint from
$S_k$, but in this case we can observe that $S_2 \not\subseteq S_k$, as this would imply, again by Lemma 4, that
$S_1 \subseteq S_k$. Since $S_2 \not\subseteq S_k$, it is either disjoint from $S_k$, in which case $S_1 \cup S_2$ is as well, or $S_k \subseteq S_2$,
which implies that $S_k \subseteq S_1 \cup S_2$, as required. Similarly, in the collection $\mathcal{C}'' = \{S_2 \setminus S_1, S_3, \ldots, S_n\}$,
a case analysis shows that the set $S_2 \setminus S_1$ does not overlap any set $S_k$ for $k \ge 4$.

Thus, we need only verify that one of these two collections preserves the overlap between $S_3$ and
the replacement for $S_2$. To see that at least one suffices, let $\mathcal{C}'$ fail to be an overlap representation
for $P_{n-1}$, which implies that $S_3 \subseteq S_2 \cup S_1$, as we have only enlarged $S_2$. Thus, $S_1 \cap S_3 \ne \emptyset$, as
$S_3$ is contained in neither $S_1$ nor $S_2$, but it is contained in their union. Also, since $S_1$ does not
contain any other set in the representation, we must have $S_1 \subseteq S_3$ as these two vertices are not
adjacent in the path. Notice also that, since $S_3$ is contained in $S_1 \cup S_2$, and $S_3$ is not contained
in $S_1$, we must have $S_3 \cap (S_2 \setminus S_1) \ne \emptyset$. Seeking a contradiction, we assume that $S_3$ and $S_2 \setminus S_1$
also do not overlap. Since $\mathcal{C}$ is an overlap representation for $P_n$, we must have $S_3 \not\subseteq S_2 \setminus S_1 \subseteq S_2$,
as the vertices associated with $S_2$ and $S_3$ are adjacent. This leaves only one way for $S_3$ to fail to
overlap $S_2 \setminus S_1$, which is $(S_2 \setminus S_1) \subseteq S_3$. If this is the case, then we have $S_2 \subseteq S_3$, as we know that
$S_1 \subseteq S_3$, which we derived from the failure of $\mathcal{C}'$. This contradicts the fact that $\mathcal{C}$ is an overlap
representation for $P_n$, and so one of $\mathcal{C}'$ and $\mathcal{C}''$ must form a valid representation for $P_{n-1}$. In both
of these representations, each set either contains $S_1$ or is disjoint from it, and so there is no loss in
replacing the elements of $S_1$ with a single element. This reduces the size of the representation by
at least one, as a set needs at least two elements to overlap another set. Hence, we have formed a
representation of $P_{n-1}$ of size at most $\varphi(P_n) - 1$. By induction on $n$, we have shown that

$$\varphi(P_n) \ge 1 + \varphi(P_{n-1}) = 1 + n - 1 = n.$$

To finish the proof, it is sufficient to build a representation of this size. Consider the representation
for $P_n$ given by, for $1 \le i \le n - 1$,

$$S_i = \{i, i+1\},$$
$$S_n = \{1, 2, \ldots, n-1\}.$$

Notice that in this representation, on the first $n - 1$ vertices, the set $S_i$ overlaps only the sets $S_{i-1}$
and $S_{i+1}$ and is disjoint from the other sets, with the exception of $S_n$. Also, $S_n$ contains all sets
except $S_{n-1}$, which it overlaps, and so this is an overlap representation for $P_n$ using $n$ elements.
This proves that $\varphi(P_n) = n$. □

The representation used in the proof of the theorem is optimal in the number of elements used, 
and can be constructed in $O(n)$ time, which is asymptotically optimal, as a representation needs
to have linear size. Thus we can view this construction as an efficient algorithm to find an overlap 
representation of a path. 
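
As an illustration, the construction from the proof of Theorem 14 can be written out and checked mechanically. In the Python sketch below the function names are hypothetical, and the vertices of the path are assumed to be numbered 1 to n in path order.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def overlap_graph(rep):
    """Edge set of the graph that the collection `rep` represents."""
    return {frozenset((u, v)) for u, v in combinations(rep, 2)
            if overlaps(rep[u], rep[v])}


def path_representation(n):
    """S_i = {i, i+1} for i < n, and S_n = {1, ..., n-1}."""
    if n < 3:
        raise ValueError("the construction assumes n >= 3")
    rep = {i: {i, i + 1} for i in range(1, n)}
    rep[n] = set(range(1, n))
    return rep


def path_edges(n):
    """Edges of the path 1 - 2 - ... - n."""
    return {frozenset((i, i + 1)) for i in range(1, n)}
```

Comparing `overlap_graph(path_representation(n))` with `path_edges(n)` confirms, for any particular n, that the collection represents the path and uses exactly n elements.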

Having found the overlap number of a path, we can find immediate lower bounds on the size 
of the overlap representation for some other simple graphs. The first of these is C n , the cycle 
on n vertices. Once again, the lower bound is matched by a simple construction, which can be 
transformed immediately into an algorithm with running time linear in $n$. This result is not true
for $n = 3$, as $\varphi(C_3) = 3$.

Corollary 15. For $n \ge 4$, $\varphi(C_n) = n - 1$.

Proof. To see that $\varphi(C_n) \ge n - 1$ we simply observe that by Theorem 14, the size of the representation
for any $n - 1$ of the $n$ vertices is at least $n - 1$, and so it remains only to construct a
representation using $n - 1$ elements. We do this by setting, for $1 \le i \le n - 2$, $S_i = \{i, i + 1\}$,
which forms an overlap representation for a path of $n - 2$ vertices, using $n - 1$ elements. We add
to this representation $S_{n-1} = \{1, 2, 3, \ldots, n - 2\}$ and $S_n = \{2, 3, 4, \ldots, n - 1\}$, noting that $S_{n-1}$
overlaps only $S_n$ and $S_{n-2}$, containing the other sets, and that $S_n$ overlaps only $S_{n-1}$ and $S_1$ as it
contains all other sets in the collection. Thus, the collection $\mathcal{C} = \{S_1, S_2, \ldots, S_n\}$ forms an overlap
representation for $C_n$ using $n - 1$ elements, proving that $\varphi(C_n) = n - 1$. □
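
The cycle construction admits the same kind of mechanical check. The following Python sketch uses hypothetical function names and assumes the cycle vertices are numbered 1 to n in cyclic order; it builds the collection from the proof of Corollary 15 and recovers the graph it represents.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def overlap_graph(rep):
    """Edge set of the graph that the collection `rep` represents."""
    return {frozenset((u, v)) for u, v in combinations(rep, 2)
            if overlaps(rep[u], rep[v])}


def cycle_representation(n):
    """S_i = {i, i+1} for i <= n-2, S_{n-1} = {1..n-2}, S_n = {2..n-1}."""
    if n < 4:
        raise ValueError("the construction assumes n >= 4")
    rep = {i: {i, i + 1} for i in range(1, n - 1)}
    rep[n - 1] = set(range(1, n - 1))
    rep[n] = set(range(2, n))
    return rep


def cycle_edges(n):
    """Edges of the cycle 1 - 2 - ... - n - 1."""
    return {frozenset((i, i % n + 1)) for i in range(1, n + 1)}
```

For n = 4 the collection is {1,2}, {2,3}, {1,2}, {2,3}, which shows that repeated sets are allowed and are what keeps opposite vertices of the 4-cycle nonadjacent.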

We next consider overlap representations of caterpillars. A tree is a caterpillar if the non-leaf 
vertices form a path, known as the spine of the caterpillar. We use Theorem 14 to find a lower
bound on the size of an overlap representation for a caterpillar, and pair this result with a simple 
construction to show that the bound is tight. 

Corollary 16. For a caterpillar $T$ with spine containing $k \ge 1$ vertices, $\varphi(T) = k + 2$.

Proof. We show that the size of a minimum overlap representation for a caterpillar is
determined by the size of the overlap representation for the longest path in the caterpillar. Let $T$
be a caterpillar, and label the vertices of the spine in order $\{1, 2, \ldots, k\}$, and let $L_i$ be the leaves
connected to vertex $i$ of the spine. Notice that any longest path in $T$ has a vertex in $L_1$ and a
vertex in $L_k$ as endpoints, with the remaining vertices being those of the spine. This allows the
above labelling scheme to be implemented in linear time, as a longest path in a tree can be found
in linear time. Also notice that the longest path in $T$ contains $k + 2$ vertices, and so Theorem 14
provides a lower bound of $\varphi(T) \ge k + 2$.

To show a tight bound, we need only find a representation of the correct size. The representation
used is similar to the one used in the proof of Theorem 14. For $T$ a caterpillar, with nodes labelled
$1, 2, \ldots, k$ that form the spine, with node $i$ adjacent to nodes $i - 1$ and $i + 1$, and $L_i$ the set of leaves
adjacent to vertex $i$, consider the representation given by, for $1 \le i \le k$,

$$S_i = \{i + 1, i + 2\},$$
$$S_{L_i} = \{1, 2, \ldots, i + 1\},$$

where the set $S_{L_i}$ is associated with all vertices in $L_i$. This representation coincides with the one
previously given for paths, since viewing a path on $n$ vertices as a caterpillar produces a caterpillar
with $n - 2$ vertices on the spine, and two leaves, one on each end of the path. To see that the
given representation is correct, notice that the sets of two vertices of the spine $i$ and $j$ overlap if and only if
$|i - j| = 1$. Notice also that the sets assigned to two leaves never overlap, as $S_{L_i}$ contains all $S_{L_j}$
for $j < i$. In addition, $S_{L_i}$ overlaps only $S_i$, since $S_{L_i}$ contains $S_j$ for $j < i$, and $S_{L_i}$ is disjoint from
$S_j$ for $j > i$. This proves that $\varphi(T) = k + 2$. □

This representation can be constructed in time proportional to the sum of the sizes of the sets of the
representation, which is $O(nk)$.
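
A short Python sketch of the caterpillar construction is given below. The input encoding is an assumption made here for illustration: a caterpillar is described by a list `leaf_counts`, whose entry i-1 is the number of leaves attached to spine vertex i, and vertices are named by tuples.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def overlap_graph(rep):
    """Edge set of the graph that the collection `rep` represents."""
    return {frozenset((u, v)) for u, v in combinations(rep, 2)
            if overlaps(rep[u], rep[v])}


def caterpillar_representation(leaf_counts):
    """S_i = {i+1, i+2} on the spine; every leaf at i gets {1, ..., i+1}."""
    rep = {}
    for i, count in enumerate(leaf_counts, start=1):
        rep[("spine", i)] = {i + 1, i + 2}
        for j in range(count):
            rep[("leaf", i, j)] = set(range(1, i + 2))
    return rep


def caterpillar_edges(leaf_counts):
    """Spine edges plus one edge from each leaf to its spine vertex."""
    k = len(leaf_counts)
    edges = {frozenset((("spine", i), ("spine", i + 1)))
             for i in range(1, k)}
    for i, count in enumerate(leaf_counts, start=1):
        for j in range(count):
            edges.add(frozenset((("spine", i), ("leaf", i, j))))
    return edges
```

All leaves attached to the same spine vertex share one set, so the number of elements depends only on the spine length k, not on the number of leaves.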

3.3 Disconnected Graphs 

In this section we examine the size of a minimum overlap representation for a disconnected graph 
based on the sizes of minimum overlap representations of each of the connected components of the
graph. This allows us to find the minimum overlap representation of a graph composed of the 
pieces we have already studied, and may lead to divide and conquer algorithms to find the size of 
an overlap representation for graphs such as threshold graphs and cographs that can be defined in 
terms of decomposition schemes. 

Theorem 17. If $G$ is a graph with connected components $B_1, B_2, \ldots, B_k$, then

$$\varphi(G) = \sum_{i=1}^{k} \varphi(B_i) - (k - 1).$$

Proof. If k = 1, the theorem is trivially true. We assume that all components of G have size at 
least two, as isolated vertices can be added to a nonempty graph without increasing the size of the 
overlap representation, by assigning the isolated vertex a set consisting of any single element. In 
the case that G consists only of isolated vertices, the theorem is also trivially true. To prove this 
theorem we first, as before, show a lower bound, and then argue that a representation achieving
this lower bound must exist. 

Suppose $k = 2$. By Lemma 5, the two components must either be independent, with no elements
in common in the overlap representation, or some sets of one component can contain all sets of
the other. If the two components are independent then $\varphi(G) = \varphi(B_1) + \varphi(B_2)$. In the other
case, assume without loss of generality that some set associated with a vertex of $B_2$ contains a set
associated with a vertex of $B_1$. Thus, by Lemma 5, any set associated with a vertex of $B_2$ that
intersects the set $U$ of elements in the union of the sets associated with the vertices of $B_1$ must
contain all of $U$. In this case the elements of $U$ may be considered to act as a single element and
so, given a minimum overlap representation for $G$, we can take the representation restricted to $B_2$
and replace the elements of $U$ by a single new element, resulting in an overlap representation for
$B_2$ of size at most $\varphi(G) - \varphi(B_1) + 1$. Therefore, $\varphi(B_2) \le \varphi(G) - \varphi(B_1) + 1$, which yields the desired bound

$$\varphi(G) \ge \varphi(B_1) + \varphi(B_2) - 1.$$
In the case that $k \ge 3$, we consider a minimum overlap representation $\mathcal{C} = \{S_v : v \in V\}$, and
once again show a lower bound on the size of $\mathcal{C}$. Take any three components with vertex sets $A$, $B$,
and $C$. If some set associated with a vertex of $A$ is contained in a set $S_b$ for $b \in B$, and some
set associated with not necessarily the same vertex of $A$ is contained in $S_c$ for $c \in C$, then, by
Lemma 5, the sets $S_b$ and $S_c$ must contain $\bigcup_{a \in A} S_a$. In particular, $S_b$ and $S_c$ intersect, and so one
set must contain the other, as they are sets associated with nonadjacent vertices in $G$. This forces
a containment relationship between $B$ and $C$, so that the set associated with any vertex of $A$ is
forced to be contained in the sets associated with the vertices of one of $B$ or $C$ by transitivity.
To see how this observation is useful, we build a graph $F'$, where the vertices of the graph are
components in $G$, and two vertices $A$ and $B$ are connected by a directed edge if there is some
vertex $a \in A$ and $b \in B$ such that $S_a \subseteq S_b$ in $\mathcal{C}$. Notice that by Lemma 5, each pair of vertices
is either nonadjacent, or connected by one directed edge. The above observation is then simply
the observation that no vertex, $v$, of $F'$ is connected to two nonadjacent vertices by edges directed
away from $v$. This implies that if we take the transitive reduction of $F'$, we obtain a graph with no
cycles, and this graph remains acyclic even if we discard the orientation of the edges. Let $F$ be the
directed forest resulting from this transitive reduction. Since the edges of $F$ represent containment
and no vertex is connected by directed edges to two nonadjacent vertices, each tree has a unique
root that all edges of the tree are directed towards.

As in the case that $k = 2$, if two components $B_i$ and $B_j$ are related by containment such that
$S_i \subseteq S_j$ for some $i \in B_i$, $j \in B_j$, the elements of $U = \bigcup_{v \in B_i} S_v$ function as a single element in the
representation for the vertices of $B_j$, which is otherwise unrestricted. Thus if we take an overlap
representation of these two components we are able to find a representation that is at most one
element smaller than the representation of the two components by disjoint sets. Notice that we
can save this one element once for every edge of $F$, as these edges count exactly the containment
relationships that are not forced by transitivity. The largest number of edges $F$ can have is one
fewer than the number of components of $G$, as there must be some root vertex that is not connected
by a directed edge to any other vertex. This provides the following lower bound,

$$\varphi(G) \ge \sum_{i=1}^{k} \varphi(B_i) - (k - 1). \qquad (2)$$

To show that a representation exists that achieves this bound, we take a minimum overlap
representation for each component $B_i$ of $G$, such that any two of these representations are disjoint.
We then, for each $i$ in increasing order, create a containment relationship between $B_i$ and $B_{i+1}$,
by choosing an arbitrary element of the representation for $B_{i+1}$ and replacing it with the union
of all elements used in the representation of $B_i$. The resulting representation is a valid overlap
representation for $G$, as we have replaced elements in such a way as to not affect the overlapping
properties within a component, and, given any two components, if two sets of the representations
associated with them have nonempty intersection, then one set must contain the other, so that there
are no adjacencies created between components. Notice that this representation has size given by
Equation (2), as we have taken optimal representations for each component, and removed exactly
$k - 1$ elements, and so this is an optimal overlap representation for $G$, of size $\sum_{i=1}^{k} \varphi(B_i) - (k - 1)$,
which proves the theorem. □
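
The chaining step in this construction can be sketched in Python as follows. This is an illustration under stated assumptions: each component is given as a dictionary from vertices to nonempty sets of integers, elements are relabelled as pairs to keep components disjoint, and the "arbitrary element" is taken to be the smallest one. All names are hypothetical.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def overlap_graph(rep):
    """Edge set of the graph that the collection `rep` represents."""
    return {frozenset((u, v)) for u, v in combinations(rep, 2)
            if overlaps(rep[u], rep[v])}


def combine_components(reps):
    """Chain component representations, saving one element per link."""
    combined, below = {}, set()
    for idx, rep in enumerate(reps):
        # Relabel vertices and elements so that components are disjoint.
        cur = {(idx, v): {(idx, e) for e in s} for v, s in rep.items()}
        if below:
            # Replace one element of this component by everything used so far.
            victim = min(set().union(*cur.values()))
            cur = {v: ((s - {victim}) | below if victim in s else s)
                   for v, s in cur.items()}
        combined.update(cur)
        below = set().union(*cur.values())
    return combined
```

Every set of a later component either contains all elements used by the earlier components or none of them, so no overlap is created across components, and each link removes exactly one element from the total.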

4 Hardness Results 

In this section we present some NP-completeness results for problems related to finding the mini- 
mum overlap representation of a given graph. 

4.1 Extending a Representation 

A natural approach to finding the overlap number for a graph is to employ a greedy strategy, adding 
one vertex at a time, and only making changes to the set associated with the newly added vertex. 
Unfortunately, this is not a feasible approach for a general graph and overlap representation, as 
the problem of deciding whether or not a new element needs to be added to the representation is 
NP-complete. The formal statement of this decision problem is as follows. 

Problem. The Overlap Extension problem is defined as: 

Instance: A graph, $G = (V, E)$, an overlap representation $C = \{S_v : v \in V\}$ of $G$, and a set 
$A \subseteq V$. 

Question: Is there a set $S \subseteq \bigcup_{v \in V} S_v$ that overlaps $S_v$ if and only if $v \in A$? 
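
To make the decision problem concrete, the following Python sketch answers it by exhaustive search over all subsets of the existing elements. It takes exponential time and is meant only to illustrate the definition; the function name `overlap_extension` and the dictionary encoding are assumptions of this sketch.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def overlap_extension(rep, targets):
    """Return a set over the existing elements that overlaps rep[v] exactly
    for the vertices v in `targets`, or None if no such set exists."""
    universe = sorted(set().union(*rep.values()))
    for r in range(len(universe) + 1):
        for candidate in combinations(universe, r):
            s = set(candidate)
            if all(overlaps(s, rep[v]) == (v in targets) for v in rep):
                return s
    return None


edge = {"a": {1, 2}, "b": {2, 3}}
both = overlap_extension(edge, {"a", "b"})   # a set overlapping both sets
stuck = overlap_extension({"a": {1}}, {"a"})  # no subset of {1} overlaps {1}
```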

Since such an extension can be efficiently verified, this problem is in NP. To see that the related 
problem on intersection representations can be solved efficiently, notice that in the intersection case, 
an element $i$ of the representation can be added to $S$ if and only if the set $A$ contains all vertices $v$ 
with $i \in S_v$. If all elements that can be added to $S$ fail to form an intersection representation for 
the extended graph, then without introducing a new element, no such extension is possible. 
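
The intersection case described above admits a direct polynomial-time procedure, sketched here in Python. The name `intersection_extension` and the dictionary encoding are illustrative assumptions.

```python
def intersection_extension(rep, targets):
    """Extend an intersection representation without new elements.

    An element may be used only if every set containing it belongs to a
    vertex in `targets`. Taking all such elements is the best possible
    choice, so it suffices to test that single candidate.
    """
    universe = set().union(*rep.values())
    allowed = {
        e for e in universe
        if all(v in targets for v, s in rep.items() if e in s)
    }
    if all(allowed & rep[v] for v in targets):
        return allowed
    return None


# Intersection representation of the path a - b - c.
path = {"a": {1}, "b": {1, 2}, "c": {2}}
ok = intersection_extension(path, {"a", "b"})
needs_new_element = intersection_extension(path, {"a", "c"})
```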

Returning to the overlap case, the problem that we reduce to Overlap Extension is the 
Not-All-Equal 3SAT problem, which is identical to the standard 3SAT problem, with the exception 
that we seek a satisfying truth assignment where no clause has all true literals. This problem is 
known to be NP-complete [15]. 

Theorem 18. Overlap Extension is NP-complete. 

Proof. Let $(U, F)$ be an instance of Not-All-Equal 3SAT, where $U = \{x_1, x_2, \ldots, x_n\}$ is the set 
of variables, and $F = \{c_1, c_2, \ldots, c_m\}$ is the set of clauses with $|c_i| = 3$, for each $i$. If $n < 4$, we can 
examine all possible truth assignments to determine if there is a solution to the Not-All-Equal 
3SAT instance, and output a trivial yes or no instance of Overlap Extension. 

If $n \ge 4$, we construct a graph $G = (V, E)$, an overlap representation $C$ of $G$, and a set $A$, to 
form an instance of Overlap Extension. The vertices in the graph are given by 

$$V = \{v_i : 1 \le i \le n\} \cup \{w_i : 1 \le i \le m\},$$

where each $v_i$ is associated with a variable $x_i \in U$, and each $w_i$ is associated with a clause $c_i \in F$. 
We take the overlap representation $C$ given by 

$$C = \{S_{v_i} = \{x_i, \neg x_i\} : x_i \in U\} \cup \{S_{w_i} = c_i : c_i \in F\},$$

and we set $E$ to those edges consistent with this representation. Finally we let $A = V$ to complete 
the instance $(G, C, A)$ of Overlap Extension. This transformation can clearly be performed in 
polynomial time. A solution of the extension problem is a set of literals that overlaps each set in 
$C$, and we show that such a set is equivalent to a satisfying truth assignment for $(U, F)$ in which 
each clause has at least one false literal. 

To see this, let $S \subseteq U \cup \{\neg x : x \in U\}$ be a set that overlaps all elements of $C$. Since $S$ overlaps 
each $S_{v_i} = \{x_i, \neg x_i\}$, $S$ must contain exactly one element of $S_{v_i}$, and so we consider the truth 
assignment $T$ that makes each literal in $S$ true. In addition, $S$ overlaps each $S_{w_i} = c_i$, which forces 
at least one, but not all, of the literals in $c_i$ to be contained in $S$, which shows that $T$ satisfies the 
clause $c_i$ without making all literals true. 

In the other direction, we take any truth assignment $T$ that satisfies $(U, F)$ without making 
all literals in any clause true, and consider the set $S$ of all literals made true by $T$. Since $T$ is a 
truth assignment, for each $1 \le i \le n$, $S$ contains exactly one of $x_i$ and $\neg x_i$, and so $S$ overlaps $S_{v_i}$ 
for all $i$. Furthermore, since $T$ is a satisfying truth assignment, $S$ must intersect each $S_{w_i}$, and it 
cannot contain any $S_{w_i}$, as this would imply that $T$ satisfies all literals of the clause $c_i$. Finally, 
$|S_{w_i}| = |c_i| = 3$, and $|S| = |U| \ge 4$, so $S_{w_i}$ cannot contain $S$ for any $i$. This implies that $S$ overlaps 
$S_{w_i}$ for all $1 \le i \le m$, and so $S$ is a solution to the instance of Overlap Extension. □ 
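
The reduction can be checked on a small instance with the following Python sketch. The integer encoding of literals ($i$ for $x_i$, $-i$ for $\neg x_i$), the function names, and the example clauses are all assumptions chosen for illustration; the search for an assignment is brute force.

```python
from itertools import product


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def build_instance(num_vars, clauses):
    """Build the sets S_{v_i} = {x_i, not x_i} and S_{w_j} = c_j."""
    rep = {("v", i): {i, -i} for i in range(1, num_vars + 1)}
    for j, clause in enumerate(clauses, start=1):
        rep[("w", j)] = set(clause)
    return rep, set(rep)  # A = V: the new set must overlap every set


def is_extension(candidate, rep, targets):
    return all(overlaps(candidate, s) == (v in targets) for v, s in rep.items())


def nae_assignment(num_vars, clauses):
    """Brute-force search for a not-all-equal satisfying assignment,
    returned as the set of literals it makes true."""
    for bits in product([False, True], repeat=num_vars):
        true_literals = {i if bit else -i for i, bit in enumerate(bits, start=1)}
        if all(0 < len(true_literals & set(c)) < len(c) for c in clauses):
            return true_literals
    return None


clauses = [(1, 2, 3), (-1, -2, 4), (2, -3, -4)]
rep, targets = build_instance(4, clauses)
solution = nae_assignment(4, clauses)
```

Here the literals made true by the assignment found form a valid extension, while the all-true assignment does not, because its literal set contains the first clause.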

Using a similar reduction, we can show the hardness of the Containment 
Extension problem, which is the analogue of the Overlap Extension problem on containment 
representations. In this case the reduction is from the well-known NP-complete 3SAT problem. 

Theorem 19. Containment Extension is NP-complete. 

Proof. Let $(U, F)$ be an instance of 3SAT, where $U = \{x_1, x_2, \ldots, x_n\}$ is a set of $n$ variables, 
and $F = \{c_1, c_2, \ldots, c_m\}$ is a set of $m$ clauses, each containing three literals. We may once again 
consider only the case where $n \ge 4$, as the reduction can output a trivial yes or no instance if this 
is not the case. 

The vertices of the constructed graph $G = (V, E)$ are given, similarly to the Overlap Extension 
case, by 

$$V = \{v_i : 1 \le i \le n\} \cup \{w_i : 1 \le i \le m\} \cup \{z\}.$$

We set $L = \bigcup_{i=1}^{n} \{x_i, \neg x_i\}$, the set of all literals, and construct the containment representation 
given by the collection $C$ consisting of the following sets, for all $1 \le i \le n$ and $1 \le j \le m$, 

$$S_{v_i} = \{x_i, \neg x_i\}, \qquad S_{w_j} = (L \setminus c_j) \cup \{0\}, \qquad S_z = \{0\}.$$

To complete the constructed instance, we set $A = \{z\}$. 

In a similar way to the proof of Theorem 18, it can be observed that there is a set extending 
this containment representation if and only if the original instance of 3SAT has a satisfying truth 
assignment. The idea is that any such set $S$ must contain the element 0, and so $S$ cannot be 
contained in $L \setminus c_j$ for any clause $c_j$, which is exactly the requirement that $S$ contains a literal in $c_j$. 
In addition, the truth assignment given by $S$ must form a valid partial truth assignment, since $S$ 
can contain at most one of each pair of literals, as it cannot contain any set $S_{v_i}$. The other direction 
is again similar to the proof of Theorem 18, as the set of all literals a satisfying truth assignment 
makes true, together with the element 0, is a valid extension of the containment representation. □ 
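
A small Python sketch of this construction follows. As before, literals are encoded as signed integers, which conveniently leaves 0 free for the extra element; this encoding, the function names, and the example clauses are assumptions for illustration.

```python
def comparable(a, b):
    """In a containment representation, two vertices are adjacent exactly
    when one of their sets contains the other."""
    return a <= b or b <= a


def build_containment_instance(num_vars, clauses):
    literals = {sign * i for i in range(1, num_vars + 1) for sign in (1, -1)}
    rep = {("v", i): {i, -i} for i in range(1, num_vars + 1)}
    for j, clause in enumerate(clauses, start=1):
        rep[("w", j)] = (literals - set(clause)) | {0}
    rep["z"] = {0}
    return rep, {"z"}  # A = {z}


def is_containment_extension(candidate, rep, targets):
    return all(comparable(candidate, s) == (v in targets) for v, s in rep.items())


clauses = [(1, 2, 3), (-1, 2, 4)]
rep, targets = build_containment_instance(4, clauses)
all_true = {0, 1, 2, 3, 4}       # true literals of a satisfying assignment, plus 0
all_false = {0, -1, -2, -3, -4}  # this assignment falsifies the first clause
```

The set built from the falsifying assignment is contained in the set of the first clause vertex, so it is rejected, matching the argument in the proof.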

4.2 Containment-Free Representations 

In the remainder of this section, we consider the problem of finding a minimum overlap 
representation where no set is contained in any other, or where the number of set containments is limited. 

Problem. The CF-Overlap Number problem is defined as: 

Instance: A graph, G = (V,E), and a natural number k. 

Question: Does the graph G have a containment-free overlap representation of size k? 

In the absence of containment, the definitions of overlap and intersection coincide, and so this 
problem is equivalent to the problem of finding a minimum containment-free intersection 
representation. In order to show the hardness of this problem, we reduce the Intersection Number 
problem to it, since Intersection Number is known to be NP-complete [9]. 

Theorem 20. CF-Overlap Number is NP-complete. 

Proof. Given an instance $G = (V, E)$ and $k$ of Intersection Number, with $n = |V|$, we construct 
the graph $G'$ by adding, for each $v \in V$, a new vertex $v'$ that is adjacent only to $v$. Let $V'$ be 
the set of all new vertices in $G'$, and let $E'$ be the set of all edges in $G'$ incident on a vertex in 
$V'$. The instance of CF-Overlap Number is then given by $G'$ and $k + 2n$, which can clearly be 
constructed in polynomial time. 

Notice that any containment-free overlap representation forms a containment-free intersection 
representation, and further that in any containment-free intersection representation for $G'$, the sets 
$S_v$ and $S_{v'}$ associated with a vertex $v \in V$ and $v' \in V'$ must share a common element, as these 
vertices are adjacent, and furthermore, since $v'$ is adjacent only to $v$, this element can only be found 
in $S_v$ and $S_{v'}$. The set $S_{v'}$ is not contained in $S_v$, and so it must contain at least one other element, 
which is unique to the set $S_{v'}$, since $v'$ is adjacent only to $v$. This implies that for all $v \in V$, there 
are at least two elements found only in one or both of $S_v$ and $S_{v'}$, which ensures that there are 
no containment relationships between any sets of the representation. Since these elements suffice 
to represent the vertices in $V'$, and the representation is already containment free, the remaining 
elements of the representation form an arbitrary intersection representation for $G$. Hence, the 
containment-free overlap number of $G'$ is exactly $\theta_e(G) + 2n$, where $\theta_e(G)$ is the size of a minimum 
intersection representation for $G$. Thus $G$ has an intersection representation of size $k$ if and only if 
$G'$ has a containment-free overlap representation of size $k + 2n$. □ 
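
One direction of this argument can be illustrated with a Python sketch that builds $G'$ and turns an intersection representation of $G$ into a containment-free overlap representation of $G'$ using $2n$ extra elements. The names, the tuple encoding of the new vertices and elements, and the assumption that every set in the given representation is non-empty are choices of this sketch.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def add_pendants(vertices, edges):
    """Build G': each vertex v gains a new neighbour (v, "prime")."""
    new_vertices = list(vertices) + [(v, "prime") for v in vertices]
    new_edges = list(edges) + [(v, (v, "prime")) for v in vertices]
    return new_vertices, new_edges


def extend_representation(rep):
    """Add one element shared by v and v' and one element private to v'.
    Assumes every set in `rep` is non-empty."""
    out = {}
    for v, s in rep.items():
        shared, private = ("shared", v), ("private", v)
        out[v] = set(s) | {shared}
        out[(v, "prime")] = {shared, private}
    return out


# Path a - b - c with an intersection representation of size 2.
vertices, edges = add_pendants(["a", "b", "c"], [("a", "b"), ("b", "c")])
rep = extend_representation({"a": {1}, "b": {1, 2}, "c": {2}})
edge_set = {frozenset(e) for e in edges}
valid = all(
    overlaps(rep[u], rep[v]) == (frozenset((u, v)) in edge_set)
    for u, v in combinations(vertices, 2)
)
containment_free = not any(
    rep[u] <= rep[v] or rep[v] <= rep[u] for u, v in combinations(vertices, 2)
)
size = len(set().union(*rep.values()))  # 2 + 2 * 3 elements
```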

4.3 Overlap Representations with Limited Containment 

We can extend the hardness of the CF-Overlap Number problem to the problem of finding 
a minimum overlap representation of a graph, using at most a constant number of containment 
relationships between sets of the representation. Formalized as a decision problem, we consider the 
following problem. The factor of 2 appears since the nonadjacent pairs $(u, v)$ and $(v, u)$ are both 
counted, but we refer to the single non-edge as a containment relationship. 

Problem. The L-Containment Overlap Number problem, for any natural number L, is 
defined as: 

Instance: A graph, G = (V,E), and a natural number k. 

Question: Is there some collection $C = \{S_v : v \in V\}$ that forms an overlap representation, such 
that $|\bigcup_{v \in V} S_v| \le k$ and $|\{(u, v) \notin E : u \ne v \text{ and } S_u \cap S_v \ne \emptyset\}| \le 2L$? 
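
A certificate for this problem can be checked in polynomial time, as the following Python sketch illustrates. It counts each non-edge with intersecting sets once (rather than as two ordered pairs) and compares against $L$ directly; the function name and the example are assumptions of this sketch.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def is_l_containment_representation(vertices, edges, rep, k, limit):
    """Check that `rep` is an overlap representation of the graph with at
    most k elements and at most `limit` containment relationships."""
    edge_set = {frozenset(e) for e in edges}
    containments = 0
    for u, v in combinations(vertices, 2):
        adjacent = frozenset((u, v)) in edge_set
        if overlaps(rep[u], rep[v]) != adjacent:
            return False
        # A non-edge whose sets intersect must be represented by containment.
        if not adjacent and rep[u] & rep[v]:
            containments += 1
    size = len(set().union(*rep.values()))
    return size <= k and containments <= limit


vertices = ["a", "b", "c", "d"]
edges = [("a", "b"), ("c", "d")]
rep = {"a": {1, 2}, "b": {2, 3}, "c": {1, 2, 3, 5}, "d": {5, 6}}
```

In this example the set for `c` contains the sets for both `a` and `b`, so the representation uses five elements and two containment relationships.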

This problem, when $L = 0$, is exactly the CF-Overlap Number problem and so by Theorem 20 
it is NP-complete in this case. For any constant $L$, a simple Turing reduction from the 
CF-Overlap Number problem is given by making $2L + 1$ copies of the input graph, and then finding an 
overlap representation with no more than $L$ containments, which, by the pigeonhole principle, must 
leave at least one copy of $G$ containment free, both internally, and with respect to other components 
of the graph. Furthermore, if we have a minimum representation, then this representation for $G$ 
must also be minimum, as the sets associated with the vertices of this copy of $G$ are disjoint from 
the sets associated with vertices in any other copy. With a little more work, we can find a many-one 
reduction from the CF-Overlap Number problem, by adding to the graph $G$ extra components 
where a minimum representation is compelled to "spend" all $L$ set containments, leaving $G$ with 
a containment-free representation. We can do this in such a way that we can track the number of 
elements these extra components add to the representation. To show this result we make use of 
Corollary 6, which gives an upper bound on the number of elements we are able to save by allowing 
containment relationships between components of the constructed graph. 

Theorem 21. For any $L \in \mathbb{N}$, L-Containment Overlap Number is NP-complete. 

Proof. Let $G = (V, E)$ and $k$ be an instance of CF-Overlap Number. We set $n = |V|$, and we 
consider only cases where $n \ge 4$, as smaller cases can be solved as part of the transformation by 
searching all possible representations and producing as output a trivial yes or no instance. In the 
instance we construct, we add $2L$ components to the graph $G$. Each of these components is given 
by the graph $B_i = (V_i, E_i)$, which is constructed from $n + 1$ disjoint edges, with three nonadjacent 
universal vertices, as shown in Figure 2. More formally, the vertices $V_i$ and the edges $E_i$ of each 
component $B_i$ are given by 

$$V_i = \{v_{i,j} : 1 \le j \le 2n + 2\} \cup \{x_i, y_i, z_i\},$$

$$E_i = \{(v_{i,2j-1}, v_{i,2j}) : 1 \le j \le n + 1\} \cup \{(x_i, v_{i,j}), (y_i, v_{i,j}), (z_i, v_{i,j}) : 1 \le j \le 2n + 2\}.$$

The graph in the constructed instance of L-Containment Overlap Number is then given by 
a disjoint union, $H = G + B_1 + B_2 + \cdots + B_{2L}$, of $2L$ of these new components with the graph $G$. 
The value $k'$ is set to 

$$k' = k + 3L(n + 1) + 4L(n + 1), \qquad (3)$$

to complete the instance of L-Containment Overlap Number. 

Figure 2: Example of $B_i$ with $n = 4$. 

Before showing that the given instance of the CF-Overlap Number problem is equivalent 
to the constructed instance, we first make some observations about overlap representations of the 
graphs $B_i$. In a minimum containment-free overlap representation for the vertices $v_{i,j}$ of $B_i$, we 
must use $3(n + 1)$ elements, as each disjoint edge $(v_{i,2j-1}, v_{i,2j})$ requires at least three new elements 
in the representation. Furthermore, these three elements are given by an element unique to $S_{v_{i,2j-1}}$, 
an element unique to $S_{v_{i,2j}}$, and an element in the intersection of these two sets. We can extend a 
minimum representation for these vertices to include $x_i$ and $y_i$ without increasing the size of the 
representation. To do this, we set $S_{x_i}$ to be those elements in common to the sets associated with 
both endpoints of each edge $(v_{i,2j-1}, v_{i,2j})$, and we set $S_{y_i}$ to the elements that are unique to each of 
these sets. Since $n + 1 > 2$, these sets are not contained in any set $S_{v_{i,j}}$, and so these sets overlap, 
as desired. This forms the unique (up to permutation of the elements) minimum containment-free 
representation for all the vertices of $B_i$ except $z_i$. If we allow a single containment relationship, we 
can set $S_{z_i} = S_{x_i}$ to obtain a representation with size $3(n + 1)$. 

If we seek a containment-free overlap representation for $B_i$ the situation is more bleak, as we 
cannot extend the unique minimum containment-free representation for every vertex except $z_i$ without 
adding new elements. This is because we still must use three elements to represent each edge 
$(v_{i,2j-1}, v_{i,2j})$, but there is no partition of these elements into three sets such that each set overlaps 
both $S_{v_{i,2j-1}}$ and $S_{v_{i,2j}}$. We are required then to use four elements for each edge $(v_{i,2j-1}, v_{i,2j})$, 
with two elements in common to the sets associated with the endpoints, which brings the size of 
a minimum containment-free overlap representation of $B_i$ to $4(n + 1)$. The key to the remainder 
of the proof is that by allowing a single containment relationship, we can reduce the size of the 
representation for some component $B_i$ by $n + 1$ elements. 
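
The two representations of $B_i$ discussed above can be built and checked with the following Python sketch. The vertex and element names are assumptions for illustration; the sketch only verifies that the representations have the stated sizes and containment counts, not that they are minimum.

```python
from itertools import combinations


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def build_component(n):
    """One gadget B_i: n + 1 disjoint edges plus the vertices x, y, z."""
    pairs = [(("v", 2 * j - 1), ("v", 2 * j)) for j in range(1, n + 2)]
    endpoints = [v for pair in pairs for v in pair]
    edges = pairs + [(hub, v) for hub in "xyz" for v in endpoints]
    return endpoints + list("xyz"), edges


def representation(n, containment_free):
    rep = {"x": set(), "y": set(), "z": set()}
    for j in range(1, n + 2):
        a, b, c = ("a", j), ("b", j), ("c", j)
        left, right = {a, c}, {b, c}      # a, b unique; c shared
        rep["x"].add(c)
        rep["y"].update({a, b})
        if containment_free:
            d = ("d", j)                  # second shared element, used by z
            left.add(d)
            right.add(d)
            rep["z"].add(d)
        rep[("v", 2 * j - 1)], rep[("v", 2 * j)] = left, right
    if not containment_free:
        rep["z"] = set(rep["x"])          # the single containment S_z = S_x
    return rep


def check(n, rep):
    """Return (is a valid overlap representation, containments, size)."""
    vertices, edges = build_component(n)
    edge_set = {frozenset(e) for e in edges}
    pairs = list(combinations(vertices, 2))
    valid = all(
        overlaps(rep[u], rep[v]) == (frozenset((u, v)) in edge_set)
        for u, v in pairs
    )
    containments = sum(1 for u, v in pairs if rep[u] <= rep[v] or rep[v] <= rep[u])
    return valid, containments, len(set().union(*rep.values()))


with_containment = check(4, representation(4, containment_free=False))
without_containment = check(4, representation(4, containment_free=True))
```

For $n = 4$ the first representation uses $3(n + 1) = 15$ elements and one containment, and the second uses $4(n + 1) = 20$ elements and none, which is the gap of $n + 1$ elements that the rest of the proof exploits.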

If $C$ is a minimum L-containment overlap representation for $H$, of size no more than $k' = k + 3L(n + 1) + 
4L(n + 1)$, we show that $G$ has a containment-free overlap representation of size not more than $k$. 
We claim that, as $C$ is minimum, the representation $C$ when restricted to $G$ is already 
containment-free, and in fact, the $L$ containment relationships can be found in $L$ of the components $B_i$. To 
show this, we examine the other potential cases for a non-edge to be represented by containment, 
showing in each one that we can make a local transformation to move the containment relationship 
to some component $B_i$, in the process reducing the size of the representation, contradicting the 
optimality of $C$. 

The first such case we consider is any containment within the representation of $G$, that is, 
two vertices $u$ and $v$ such that $S_u \subseteq S_v$. We replace $S_v$ with $n - 1$ new elements, $a_1, a_2, \ldots, a_{n-1}$, 
to obtain $S'_v = \{a_1, a_2, \ldots, a_{n-1}\}$. This removes the containment relationship between $u$ and 
$v$, and forces $S'_v$ not to contain any other set in the representation. In order to ensure that the 
representation is still valid, we modify the sets associated with some of the other vertices. There are 
exactly three ways a set can interact with $S'_v$: we can have the set we consider contain $S'_v$, the two 
sets can be disjoint, or the two sets can overlap. We consider, for each of these three interactions, 
how to alter the set to maintain a valid overlap representation of $H$. For any vertex $w$ with $S_v \subseteq S_w$, 
we replace the set $S_w$ with the set $S'_w = S_w \cup S'_v$, to ensure that this containment relationship is not 
altered. This alteration does not affect the overlap, containment, or disjointedness relationships of 
the set $S_w$, as these are new elements, and by transitivity, we have added these new elements to 
any set that contains $S_w$. If $w$ is a vertex such that $S_w$ and $S_v$ are disjoint, then $S_w$ and $S'_v$ must 
also be disjoint, and there is nothing to do in this case. If $w$ is such that $S_w$ and $S_v$ overlap, then 
we have $S'_v \cap S_w = \emptyset$, which we correct by setting $S'_w = S_w \cup \{a_j\}$, for an element $a_j \in S'_v$ that we 
have not already used for this purpose. This forces $S'_w$ and $S'_v$ to overlap, as the conditions that 
$n \ge 4$ and $S_w$ overlaps the set $S_v$ ensure that there are at least two elements in each of these sets. We 
must also add the element $a_j$ to any set that contains $S_w$ to preserve this containment relationship. 
This does not affect the representation of any vertex but $v$, as the sets that the element $a_j$ is being 
added to must also intersect $S_v$, and we do not add all of the $a_j$ to a set that should not contain 
$S'_v$, since there are at most $n - 2$ vertices that are adjacent to $v$. Thus, we can remove at least 
one containment relationship from $G$, by adding $n - 1$ new elements to the representation. Since 
there are $2L$ components $B_i$, and only $L - 1$ remaining containments, there must be some $i$ for 
which the vertices of $B_i$ are involved in no containment relationships. We can use the containment 
we just removed from $G$ to reduce the size of the representation for $B_i$ from $4(n + 1)$ to $3(n + 1)$, 
which, in total, saves at least $n + 1 - (n - 1) = 2$ elements from the representation, contradicting 
the assumption that $C$ was minimal. Thus, the vertices of $G$ are not involved in any containment 
relationships in $C$. 

The second case of a containment relationship is one internal to one of the components $B_i$. If this 
containment is between two vertices $v_{i,j}$ and $v_{i,k}$, we can simply replace the representation for $v_{i,j}$ 
and the vertex $v_{i,j \pm 1}$ it forms an edge with. This is done by setting $S_{v_{i,j}} = \{a_1, a_3, a_4\}$ and $S_{v_{i,j \pm 1}} = 
\{a_2, a_3, a_4\}$, where the elements $a_1, \ldots, a_4$ are new to the representation. Finally, we add $a_1$ and $a_2$ to $S_{x_i}$, 
$a_3$ to $S_{y_i}$, and $a_4$ to $S_{z_i}$, being careful to add these elements to any set that contains one of these sets. 
If preserving these containment relationships results in all of $\{a_1, a_2, a_3, a_4\}$ being contained in one 
of $S_{x_i}$, $S_{y_i}$, or $S_{z_i}$, we simply add $a_3$ to each of $S_{x_i}$, $S_{y_i}$ and $S_{z_i}$, and remove $a_1$, $a_2$, and $a_4$ from 
these sets, once again being careful to preserve any containment relationships. This replacement 
removes the containment between $v_{i,j}$ and $v_{i,k}$, and leaves a valid overlap representation. As the 
cost of this alteration was only four elements, and we can apply the freed containment relationship 
to some other component $B_j$ to save $n + 1$ elements, this also contradicts the optimality of $C$. 

The only remaining case for a containment internal to $B_i$ is one between two of $x_i$, $y_i$, and $z_i$, as 
these vertices are adjacent to all other vertices of $B_i$. We must also consider the case that between 
the vertices $x_i$, $y_i$ and $z_i$ there are two or more containments, but since we can extend a minimum 
representation for the vertices $v_{i,j}$ to these vertices using only one containment, we can again apply 
this containment elsewhere, contradicting the optimality of $C$. 

The final case we must consider is a containment relationship between two vertices in differing 
components of $H$. Let $U$ and $W$ be the vertex sets of the two components, where for some vertex 
$u \in U$ and $w \in W$ we have $S_u \subseteq S_w$. Let $A = \bigcup_{u \in U} S_u$. Lemma 5 implies that for any vertex 
$v \in W$, either $S_v$ contains $A$ or it is disjoint from it. The elements of $A$ then, within $W$, act 
as a single element. This allows these elements to be replaced with a single new element, where 
once again, whenever we add a new element to a set we must also add this new element to any sets 
that contained the original set. After this replacement has been made, we have removed at least 
one containment relationship, at a cost of one new element in the representation, which once again 
contradicts the optimality of $C$. 

Thus, a minimum L-containment overlap representation for $H$ uses containment only between 
the vertices $x_i$, $y_i$, and $z_i$, and uses at most one containment per triple of vertices. Thus, in a 
minimum overlap representation, we have a containment-free overlap representation for $G$, and $L$ 
of the $B_i$, and we have an overlap representation using only one containment for the remaining $L$ 
of the $B_i$. Then, where $r$ is the containment-free overlap number of $G$, this representation has 
size $r + 4L(n + 1) + 3L(n + 1)$, which by Equation (3) is no more than $k'$ only when $r \le k$, as desired. 

Fortunately, the other direction is simple. If we take any containment-free overlap representation 
for $G$ of size no more than $k$, we can form the representations discussed above for each $B_i$, by 
simply using three elements per edge $(v_{i,2j-1}, v_{i,2j})$ for $L$ of the $B_i$ and four elements per edge for 
the remaining $L$. Placing the containments in appropriate places, we can find an L-containment 
overlap representation for $H$ of size no more than $k + 3L(n + 1) + 4L(n + 1) = k'$, as required. □ 

5 Conclusion 

There are many open problems related to the overlap number of a graph. Foremost among these 
unanswered questions is the complexity of computing the overlap number. 

Problem. The Overlap Number problem is defined as: 

Instance: A graph, G = (V,E), and an integer k. 

Question: Is there an overlap representation $C = \{S_v : v \in V\}$ of $G$ with $|\bigcup_{v \in V} S_v| = k$? 
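
For very small graphs the problem can be explored by exhaustive search, as in the following Python sketch. It is exponential in both the number of vertices and $k$, and as a simplifying assumption it asks whether a representation exists that uses at most $k$ elements (drawn from a fixed $k$-element set) rather than exactly $k$. The function name is hypothetical.

```python
from itertools import combinations, product


def overlaps(a, b):
    """Two sets overlap if they intersect and neither contains the other."""
    return bool(a & b) and not a <= b and not b <= a


def has_overlap_representation(vertices, edges, k):
    """Try every assignment of subsets of {0, ..., k-1} to the vertices."""
    subsets = [frozenset(c) for r in range(k + 1) for c in combinations(range(k), r)]
    edge_set = {frozenset(e) for e in edges}
    index_pairs = list(combinations(range(len(vertices)), 2))
    for assignment in product(subsets, repeat=len(vertices)):
        if all(
            overlaps(assignment[i], assignment[j])
            == (frozenset((vertices[i], vertices[j])) in edge_set)
            for i, j in index_pairs
        ):
            return True
    return False


triangle = (["a", "b", "c"], [("a", "b"), ("b", "c"), ("a", "c")])
two_elements = has_overlap_representation(*triangle, 2)
three_elements = has_overlap_representation(*triangle, 3)
```

No two subsets of a two-element set overlap, so the triangle cannot be represented with two elements, while the sets {0, 1}, {1, 2}, {0, 2} represent it with three.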

This problem is clearly in NP, as it is a simple matter to verify that a given representation 
is both correct and of the appropriate size, and the evidence suggests that this problem is also 
complete for NP. There are also many classes of graphs for which no algorithm to find a minimum 
overlap representation is known. Many of these classes, such as cographs, are classes of graphs 
for which many other combinatorial problems are tractable, and so there is reason to believe that 
efficient algorithms exist to compute the overlap number on some such classes of graphs, but they 
have yet to be discovered. 

Appendix 

This appendix contains the proofs that have been omitted from the main text. 

Proofs Omitted From Section 3 

Lemma 11. For 1 < k < n, 
as in the statement of the lemma. □ 
Theorem 12. For $n > 1$, 

$$\min\left\{m : n \le \binom{m}{\lfloor (m+2)/2 \rfloor}\right\} \in \Theta(\log n).$$

Proof. Let $x = \min\{m : n \le \binom{m}{\lfloor (m+2)/2 \rfloor}\}$. We show a lower bound by observing that there are $2^x$ 
subsets of $\{1, 2, \ldots, x\}$, and so we must have $n \le 2^x$, which implies that $x \in \Omega(\log n)$. Turning to an 
upper bound, notice that, by the definition of $x$, we have $\binom{x-1}{\lfloor (x+1)/2 \rfloor} < n$. Using this, and Lemma 11, 

we have 

We can then take logarithms to obtain 



$$\log n > \frac{x-1}{2} - \frac{1}{2}\log\bigl(4\pi(x+1)\bigr).$$
Since $x \in \Omega(\log n)$, for large enough $n$ we must have $\log(x + 1) < x/2$. We then have, by the 
above, and setting $C = \log(4\pi)/2$, 

$$\log n > \frac{x-1}{2} - \frac{x}{4} - C = \frac{x-2}{4} - C,$$

which is $x/4 < \log n + 1/2 + C$, and so we have $x \in O(\log n)$, as desired. □ 